138 research outputs found

    Efficient pruning of large knowledge graphs

    Get PDF
    In this paper we present an efficient and highly accurate algorithm to prune noisy or over-ambiguous knowledge graphs given as input an extensional definition of a domain of interest, namely as a set of instances or concepts. Our method climbs the graph in a bottom-up fashion, iteratively layering the graph and pruning nodes and edges in each layer while not compromising the connectivity of the set of input nodes. Iterative layering and protection of pre-defined nodes allow to extract semantically coherent DAG structures from noisy or over-ambiguous cyclic graphs, without loss of information and without incurring in computational bottlenecks, which are the main problem of stateof- the-art methods for cleaning large, i.e., Webscale, knowledge graphs. We apply our algorithm to the tasks of pruning automatically acquired taxonomies using benchmarking data from a SemEval evaluation exercise, as well as the extraction of a domain-adapted taxonomy from theWikipedia category hierarchy. The results show the superiority of our approach over state-of-art algorithms in terms of both output quality and computational efficiency

    A Survey of Symbolic Execution Techniques

    Get PDF
    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv

    Conflict-free star-access in parallel memory systems

    Get PDF
    We study conflict-free data distribution schemes in parallel memories in multiprocessor system architectures. Given a host graph G, the problem is to map the nodes of G into memory modules such that any instance of a template type T in G can be accessed without memory conflicts. A conflict occurs if two or more nodes of T are mapped to the same memory module. The mapping algorithm should: (i) be fast in terms of data access (possibly mapping each node in constant time); (ii) minimize the required number of memory modules for accessing any instance in G of the given template type; and (iii) guarantee load balancing on the modules. In this paper, we consider conflict-free access to star templates. i.e., to any node of G along with all of its neighbors. Such a template type arises in many classical algorithms like breadth-first search in a graph, message broadcasting in networks, and nearest neighbor based approximation in numerical computation. We consider the star-template access problem on two specific host graphs-tori and hypercubes-that are also popular interconnection network topologies. The proposed conflict-free mappings on these graphs are fast, use an optimal or provably good number of memory modules, and guarantee load balancing. (C) 2006 Elsevier Inc. All rights reserved

    Efficient pruning of large knowledge graphs

    Get PDF
    In this paper we present an efficient and highly accurate algorithm to prune noisy or over-ambiguous knowledge graphs given as input an extensional definition of a domain of interest, namely as a set of instances or concepts. Our method climbs the graph in a bottom-up fashion, iteratively layering the graph and pruning nodes and edges in each layer while not compromising the connectivity of the set of input nodes. Iterative layering and protection of pre-defined nodes allow to extract semantically coherent DAG structures from noisy or over-ambiguous cyclic graphs, without loss of information and without incurring in computational bottlenecks, which are the main problem of stateof- the-art methods for cleaning large, i.e., Webscale, knowledge graphs. We apply our algorithm to the tasks of pruning automatically acquired taxonomies using benchmarking data from a SemEval evaluation exercise, as well as the extraction of a domain-adapted taxonomy from theWikipedia category hierarchy. The results show the superiority of our approach over state-of-art algorithms in terms of both output quality and computational efficiency

    The Power of Pivoting for Exact Clique Counting

    Full text link
    Clique counting is a fundamental task in network analysis, and even the simplest setting of 33-cliques (triangles) has been the center of much recent research. Getting the count of kk-cliques for larger kk is algorithmically challenging, due to the exponential blowup in the search space of large cliques. But a number of recent applications (especially for community detection or clustering) use larger clique counts. Moreover, one often desires \textit{local} counts, the number of kk-cliques per vertex/edge. Our main result is Pivoter, an algorithm that exactly counts the number of kk-cliques, \textit{for all values of kk}. It is surprisingly effective in practice, and is able to get clique counts of graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with a 100M edges within two hours on a commodity machine. Previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge kk-clique counts (for all kk) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed unique representation of all cliques in an input graph. It is built using a technique called \textit{pivoting}, a classic approach by Bron-Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and provides a succinct data structure from which exact clique statistics (kk-clique counts, local counts) can be read off efficiently.Comment: 10 pages, WSDM 202
    • …
    corecore